Abstract. Let $A$ be a matrix whose entries are real i.i.d. centered random
variables with unit variance and suitable moment assumptions. Then the
smallest singular value $s_n(A)$ is of order $n^{-1/2}$ with high probability. The
lower estimate of this type was proved recently by the authors; in this note
we establish the matching upper estimate.

1. Introduction

Let $A$ be an $n \times n$ matrix whose entries are real i.i.d. centered random
variables with suitable moment assumptions. Random matrix theory studies
the distribution of the singular values $s_k(A)$, which are the eigenvalues of
$|A| = \sqrt{A^* A}$ arranged in non-increasing order. In this paper we study the
magnitude of the smallest singular value $s_n(A)$, which can also be viewed as
the reciprocal of the spectral norm of the inverse matrix:

(1)    $s_n(A) = \inf_{x:\, \|x\|_2 = 1} \|Ax\|_2 = 1/\|A^{-1}\|.$
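Identity (1) can be sanity-checked numerically in the $2 \times 2$ case, where the singular values have a closed form. The following is a minimal sketch; the helper names and the sample matrix are illustrative assumptions and do not come from the paper.

```python
import math

def singular_values_2x2(a):
    """Return (s_1, s_2) with s_1 >= s_2 for a 2 x 2 matrix given as nested lists."""
    (p, q), (r, s) = a
    t = p * p + q * q + r * r + s * s          # trace of A^T A
    d = (p * s - q * r) ** 2                   # determinant of A^T A
    root = math.sqrt(max(t * t - 4.0 * d, 0.0))
    return math.sqrt((t + root) / 2.0), math.sqrt(max((t - root) / 2.0, 0.0))

def inverse_2x2(a):
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def min_norm_on_circle(a, steps=100000):
    """Approximate inf ||Ax||_2 over unit vectors x by sampling the unit circle."""
    (p, q), (r, s) = a
    best = float("inf")
    for i in range(steps):
        th = math.pi * i / steps               # x and -x give the same norm
        c, sn = math.cos(th), math.sin(th)
        best = min(best, math.hypot(p * c + q * sn, r * c + s * sn))
    return best

A = [[1.0, 2.0], [0.5, -1.0]]                  # arbitrary invertible example
s_max, s_min = singular_values_2x2(A)
inv_norm = singular_values_2x2(inverse_2x2(A))[0]   # spectral norm of A^{-1}
```

For this matrix, `s_min`, `1 / inv_norm` and `min_norm_on_circle(A)` should agree up to rounding and sampling error, which is the content of (1).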

Motivated by numerical inversion of large matrices, von Neumann and his
associates speculated that

(2)    $s_n(A) \sim n^{-1/2}$ with high probability.

(See [4], pp. 14, 477, 555.) A more precise form of this estimate was
conjectured by Smale and proved by Edelman [1] for Gaussian matrices $A$. For
general matrices, conjecture (2) had remained open until we proved in [2] the
lower bound $s_n(A) = \Omega(n^{-1/2})$. In the present paper, we shall prove the
corresponding upper bound $s_n(A) = O(n^{-1/2})$, thereby completing the proof of
(2).

Theorem 1.1 (Fourth moment). Let $A$ be an $n \times n$ matrix whose entries
are i.i.d. centered random variables with unit variance and fourth moment
bounded by $B$. Then, for every $\delta > 0$ there exist $K > 0$ and $n_0$ which depend
(polynomially) only on $\delta$ and $B$, and such that

    $\mathbb{P}(s_n(A) > K n^{-1/2}) \le \delta$ for all $n \ge n_0$.

Remark. The same result but with the reverse estimate,
$\mathbb{P}(s_n(A) < K n^{-1/2}) \le \delta$, was proved in [2]. Together, these two
estimates amount to (2).

Under more restrictive (but still quite general) moment assumptions,
Theorem 1.1 takes the following sharper form. Recall that a random variable $\xi$
is called subgaussian if its tail is dominated by that of the standard normal
random variable: there exists $B > 0$ such that
$\mathbb{P}(|\xi| > t) \le 2\exp(-t^2/B^2)$ for all $t > 0$. The minimal $B$ is called
the subgaussian moment of $\xi$. The class of subgaussian random variables
includes, among others, normal, symmetric $\pm 1$, and in general all bounded
random variables.
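As a concrete instance of this definition, consider a symmetric $\pm 1$ random variable $\xi$. Its tail $\mathbb{P}(|\xi| > t)$ equals 1 for $t < 1$ and 0 for $t \ge 1$, so the condition $1 \le 2\exp(-t^2/B^2)$ for all $t < 1$ forces $B \ge 1/\sqrt{\ln 2} \approx 1.2011$. The sketch below checks this on a grid; the function names and grid parameters are illustrative assumptions.

```python
import math

def rademacher_tail(t):
    """P(|xi| > t) for a symmetric +-1 random variable xi."""
    return 1.0 if t < 1.0 else 0.0

def tail_is_dominated(b, t_max=3.0, steps=3000):
    """Check P(|xi| > t) <= 2 exp(-t^2 / b^2) on a grid of t in (0, t_max]."""
    for i in range(1, steps + 1):
        t = t_max * i / steps
        if rademacher_tail(t) > 2.0 * math.exp(-t * t / (b * b)):
            return False
    return True

b_min = 1.0 / math.sqrt(math.log(2.0))   # candidate subgaussian moment, about 1.2011
```

On this grid `tail_is_dominated(b_min)` holds, while a smaller value such as `b = 1.1` fails for `t` slightly below 1. A grid check is only evidence, not a proof; the exact value follows from the short computation in the lead-in.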

Theorem 1.2 (Subgaussian). Let $A$ be an $n \times n$ matrix whose entries are
i.i.d. centered random variables with unit variance and subgaussian moment
bounded by $B$. Then for every $K \ge 2$ one has

(3)    $\mathbb{P}(s_n(A) > K n^{-1/2}) \le (C/K) \log K + c^n,$

where $C > 0$ and $c \in (0, 1)$ depend (polynomially) only on $B$.

Remark. A reverse result was proved in [2]: for every $\varepsilon \ge 0$, one has
$\mathbb{P}(s_n(A) \le \varepsilon n^{-1/2}) \le C\varepsilon + c^n$.

Our argument is an application of the small ball probability bounds and the
structure theory developed in [2] and [3]. We shall give a complete proof of
Theorem 1.2 only; we leave it to the interested reader to modify the argument
as in [2] to obtain Theorem 1.1.

2. Proof of Theorem 1.2

By $(e_k)_{k=1}^n$ we denote the canonical basis of the Euclidean space $\mathbb{R}^n$
equipped with the canonical inner product $\langle \cdot, \cdot \rangle$ and Euclidean norm
$\|\cdot\|_2$. By $C, C_1, c, c_1, \ldots$ we shall denote positive constants that may
possibly depend only on the subgaussian moment $B$.

Consider vectors $(X_k)_{k=1}^n$ and $(X_k^*)_{k=1}^n$ in an $n$-dimensional Hilbert
space $H$. Recall that the system $(X_k, X_k^*)_{k=1}^n$ is called a biorthogonal
system in $H$ if $\langle X_j^*, X_k \rangle = \delta_{j,k}$ for all $j, k = 1, \ldots, n$. The
system is called complete if $\mathrm{span}(X_k) = H$. The following notation will be
used throughout the paper:

(4)    $H_k := \mathrm{span}(X_i)_{i \ne k}$,  $H_{j,k} := \mathrm{span}(X_i)_{i \notin \{j,k\}}$,  $j, k = 1, \ldots, n.$

The next proposition summarizes some elementary and known properties of 
biorthogonal systems. 

Proposition 2.1 (Biorthogonal systems). 1. Let $A$ be an $n \times n$ invertible
matrix with columns $X_k = A e_k$, $k = 1, \ldots, n$. Define $X_k^* = (A^{-1})^* e_k$. Then
$(X_k, X_k^*)_{k=1}^n$ is a complete biorthogonal system in $\mathbb{R}^n$.

2. Let $(X_k)_{k=1}^n$ be a linearly independent system in an $n$-dimensional Hilbert
space $H$. Then there exist unique vectors $(X_k^*)_{k=1}^n$ such that $(X_k, X_k^*)_{k=1}^n$
is a biorthogonal system in $H$. This system is complete.

3. Let $(X_k, X_k^*)_{k=1}^n$ be a complete biorthogonal system in a Hilbert space $H$.
Then $\|X_k^*\|_2 = 1/\mathrm{dist}(X_k, H_k)$ for $k = 1, \ldots, n$. □
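Parts 1 and 3 of the proposition can be checked by hand in dimension $n = 2$, where $H_1 = \mathrm{span}(X_2)$ and $H_2 = \mathrm{span}(X_1)$. The sketch below does this for one arbitrary matrix; the helper names and the matrix are illustrative assumptions.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def dual_system_2x2(a):
    """Columns X_k = A e_k and dual vectors X_k^* = (A^{-1})^* e_k (rows of A^{-1})."""
    (p, q), (r, s) = a
    det = p * s - q * r
    cols = [[p, r], [q, s]]
    duals = [[s / det, -q / det], [-r / det, p / det]]
    return cols, duals

def dist_to_line(x, direction):
    """Distance from x to span(direction) in the plane."""
    coeff = dot(x, direction) / dot(direction, direction)
    return norm([xi - coeff * di for xi, di in zip(x, direction)])

A = [[1.0, 2.0], [0.5, -1.0]]                  # arbitrary invertible example
X, X_star = dual_system_2x2(A)
```

Here `dot(X_star[j], X[k])` should be 1 when `j == k` and 0 otherwise, and `norm(X_star[0])` should equal `1 / dist_to_line(X[0], X[1])`, matching Part 3 with $k = 1$.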

Without loss of generality, we can assume that n > 2 and that A is a.s. 
invertible (by adding independent normal random variables with small variance 
to all entries of A). 

Let $u, v > 0$. By (1), the following implication holds:

(5)    $\exists x \in \mathbb{R}^n : \|x\|_2 \le u,\ \|A^{-1}x\|_2 \ge v n^{1/2}$  implies  $s_n(A) \le (u/v) n^{-1/2}.$

We will now describe how to find such $x$. Consider the columns $X_k = A e_k$ of
$A$ and the subspaces $H_k$, $H_{j,k}$ defined in (4). Let $P_1$ denote the orthogonal
projection in $\mathbb{R}^n$ onto $H_1$. We define the vector

    $x := X_1 - P_1 X_1.$

Define $X_k^* = (A^{-1})^* e_k$. By Proposition 2.1, $(X_k, X_k^*)_{k=1}^n$ is a complete
biorthogonal system in $\mathbb{R}^n$, so

(6)    $\ker(P_1) = \mathrm{span}(X_1^*).$

Clearly, $\|x\|_2 = \mathrm{dist}(X_1, H_1)$. Conditioning on $H_1$ and using a standard
concentration bound, we obtain

(7)    $\mathbb{P}(\|x\|_2 > u) \le C e^{-cu^2}, \quad u > 0.$

This settles the first bound in (5) with high probability.
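The mechanism behind implication (5) is that $s_n(A) = 1/\|A^{-1}\| \le \|x\|_2 / \|A^{-1}x\|_2$ for every nonzero $x$. The sketch below evaluates this ratio for the vector $x = X_1 - P_1 X_1$ in dimension $n = 2$, where $H_1 = \mathrm{span}(X_2)$; the function names and the sample matrix are illustrative assumptions.

```python
import math

def smallest_singular_value_2x2(a):
    """Closed-form smallest singular value of a 2 x 2 matrix."""
    (p, q), (r, s) = a
    t = p * p + q * q + r * r + s * s          # trace of A^T A
    d = (p * s - q * r) ** 2                   # determinant of A^T A
    root = math.sqrt(max(t * t - 4.0 * d, 0.0))
    return math.sqrt(max((t - root) / 2.0, 0.0))

def residual_of_first_column(a):
    """x = X_1 - P_1 X_1, where P_1 projects onto H_1 = span(X_2) (case n = 2)."""
    (p, q), (r, s) = a
    x1, x2 = (p, r), (q, s)
    coeff = (x1[0] * x2[0] + x1[1] * x2[1]) / (x2[0] ** 2 + x2[1] ** 2)
    return (x1[0] - coeff * x2[0], x1[1] - coeff * x2[1])

def apply_inverse_2x2(a, x):
    (p, q), (r, s) = a
    det = p * s - q * r
    return ((s * x[0] - q * x[1]) / det, (-r * x[0] + p * x[1]) / det)

A = [[1.0, 2.0], [0.5, -1.0]]                  # arbitrary invertible example
x = residual_of_first_column(A)
y = apply_inverse_2x2(A, x)                    # A^{-1} x
ratio = math.hypot(*x) / math.hypot(*y)        # ||x||_2 / ||A^{-1} x||_2
```

For this matrix the first coordinate of `y` is 1, and `smallest_singular_value_2x2(A)` does not exceed `ratio`, as (5) predicts.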

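To address the second bound in (5), we write
$A^{-1}x = A^{-1}X_1 - A^{-1}P_1X_1 = e_1 - A^{-1}P_1X_1$. Since $P_1X_1 \in H_1$, the
vector $A^{-1}P_1X_1$ is supported in $\{2, \ldots, n\}$ and hence is orthogonal to
$e_1$. Therefore

$$
\begin{aligned}
\|A^{-1}x\|_2^2 > \|A^{-1}P_1X_1\|_2^2
  &= \sum_{k=1}^n \langle A^{-1}P_1X_1, e_k \rangle^2 \\
  &= \sum_{k=1}^n \langle P_1X_1, (A^{-1})^* e_k \rangle^2
   = \sum_{k=1}^n \langle P_1X_k^*, X_1 \rangle^2.
\end{aligned}
$$

The first term of the last sum is zero since $P_1X_1^* = 0$ by (6). We have proved
that

(8)    $\|A^{-1}x\|_2^2 \ge \sum_{k=2}^n \langle Y_k^*, X_1 \rangle^2$,  where $Y_k^* := P_1X_k^* \in H_1$, $k = 2, \ldots, n.$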

Lemma 2.2. $(Y_k^*, X_k)_{k=2}^n$ is a complete biorthogonal system in $H_1$.

Proof. By (8) and (6), $Y_k^* - X_k^* \in \ker(P_1) = \mathrm{span}(X_1^*)$, so
$Y_k^* = X_k^* - \lambda_k X_1^*$ for some $\lambda_k \in \mathbb{R}$ and all $k = 2, \ldots, n$. By the
orthogonality of $X_1^*$ to all of $X_k$, $k = 2, \ldots, n$, we have
$\langle Y_j^*, X_k \rangle = \langle X_j^*, X_k \rangle = \delta_{j,k}$ for all $j, k = 2, \ldots, n$.
The biorthogonality is proved. The completeness follows since
$\dim(H_1) = n - 1$. □

In view of the uniqueness in Part 2 of Proposition 2.1, Lemma 2.2 has the
following crucial consequence.

Corollary 2.3. The system of vectors $(Y_k^*)_{k=2}^n$ is uniquely determined by the
system $(X_k)_{k=2}^n$. In particular, the system $(Y_k^*)_{k=2}^n$ and the vector $X_1$ are
statistically independent. □

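By Part 3 of Proposition 2.1, $\|Y_k^*\|_2 = 1/\mathrm{dist}(X_k, H_{1,k})$. We have therefore
proved that

(9)    $\|A^{-1}x\|_2^2 \ge \sum_{k=2}^n (a_k/b_k)^2$,  where $a_k = \left| \left\langle \frac{Y_k^*}{\|Y_k^*\|_2}, X_1 \right\rangle \right|$,  $b_k = \mathrm{dist}(X_k, H_{1,k}).$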
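In dimension $n = 2$ the bound (9) has a single term, $k = 2$, with $H_1 = \mathrm{span}(X_2)$ and $H_{1,2} = \{0\}$, so that $b_2 = \|X_2\|_2$. Since $A^{-1}x = e_1 - A^{-1}P_1X_1$ with the two pieces orthogonal, one expects $\|A^{-1}x\|_2^2 = 1 + (a_2/b_2)^2$ in this case. The sketch below checks this for one arbitrary matrix; the function names and the matrix are illustrative assumptions.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def project_onto_line(v, direction):
    """Orthogonal projection of v onto span(direction)."""
    coeff = dot(v, direction) / dot(direction, direction)
    return [coeff * di for di in direction]

def check_bound_2x2(a):
    """Return (||A^{-1}x||_2^2, (a_2/b_2)^2, <Y_2^*, X_2>) for a 2 x 2 matrix."""
    (p, q), (r, s) = a
    det = p * s - q * r
    x1, x2 = [p, r], [q, s]                       # columns X_1, X_2 of A
    x2_star = [-r / det, p / det]                 # X_2^* = second row of A^{-1}
    p1_x1 = project_onto_line(x1, x2)             # P_1 X_1, with H_1 = span(X_2)
    x = [x1[0] - p1_x1[0], x1[1] - p1_x1[1]]      # x = X_1 - P_1 X_1
    inv_x = [(s * x[0] - q * x[1]) / det, (-r * x[0] + p * x[1]) / det]
    y2_star = project_onto_line(x2_star, x2)      # Y_2^* = P_1 X_2^*
    a2 = abs(dot(y2_star, x1)) / math.sqrt(dot(y2_star, y2_star))
    b2 = math.sqrt(dot(x2, x2))                   # dist(X_2, H_{1,2}) with H_{1,2} = {0}
    return dot(inv_x, inv_x), (a2 / b2) ** 2, dot(y2_star, x2)

lhs, rhs, pairing = check_bound_2x2([[1.0, 2.0], [0.5, -1.0]])
```

The value `pairing` should be 1, in line with Lemma 2.2, and `lhs` should equal `1 + rhs`, which in particular gives (9).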

We will now need to bound $a_k$ below and $b_k$ above. Without loss of
generality, we will do this for $k = 2$.

We are going to use a result of [3] that states that random subspaces have
no additive structure. The amount of structure is formalized by the concept
of the least common denominator. Given parameters $\alpha > 0$ and $\gamma \in (0, 1)$, the
least common denominator of a vector $a \in \mathbb{R}^n$ is defined as

    $\mathrm{LCD}_{\alpha,\gamma}(a) := \inf\{\theta > 0 : \mathrm{dist}(\theta a, \mathbb{Z}^n) < \min(\gamma \|\theta a\|_2, \alpha)\}.$

The least common denominator of a subspace $H$ in $\mathbb{R}^n$ is then defined as

    $\mathrm{LCD}_{\alpha,\gamma}(H) = \inf\{\mathrm{LCD}_{\alpha,\gamma}(a) : a \in H,\ \|a\|_2 = 1\}.$
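To get a feel for the definition of $\mathrm{LCD}_{\alpha,\gamma}(a)$ for a single vector, one can search over $\theta$ on a grid. The sketch below is only a crude approximation from above (a coarse grid may miss narrow windows of admissible $\theta$), and the function names, grid step, and parameter values are illustrative assumptions. For $a = (1/\sqrt{2}, 1/\sqrt{2})$, $\gamma = 0.1$ and $\alpha = 1$, a direct computation gives $\mathrm{LCD}_{\alpha,\gamma}(a) = \sqrt{2}/1.1$.

```python
import math

def dist_to_integers(v):
    """Euclidean distance from v to the integer lattice Z^n."""
    return math.sqrt(sum((vi - round(vi)) ** 2 for vi in v))

def lcd_grid(a, alpha, gamma, theta_max=50.0, step=1e-4):
    """Grid approximation (from above) of LCD_{alpha,gamma}(a); None if not found."""
    norm_a = math.sqrt(sum(ai * ai for ai in a))
    k = 1
    while k * step <= theta_max:
        theta = k * step
        scaled = [theta * ai for ai in a]
        # Admissibility test from the definition: dist(theta a, Z^n) < min(gamma ||theta a||, alpha)
        if dist_to_integers(scaled) < min(gamma * theta * norm_a, alpha):
            return theta
        k += 1
    return None

a = [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]
approx = lcd_grid(a, alpha=1.0, gamma=0.1)
```

This vector is highly structured (a multiple of it is an integer vector), so its least common denominator is small; the result quoted from [3] says that unit vectors orthogonal to a random subspace behave in the opposite way.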

Since $H_{1,2}$ is the span of $n - 2$ random vectors with i.i.d. coordinates,
Theorem 4.3 of [3] yields that

    $\mathbb{P}\{\mathrm{LCD}_{\alpha,c}((H_{1,2})^\perp) \ge e^{cn}\} \ge 1 - e^{-cn}$

where $\alpha = c\sqrt{n}$, and $c > 0$ is some constant that may only depend on the
subgaussian moment $B$.

On the other hand, note that the random vector $X_2$ is statistically
independent of the subspace $H_{1,2}$. So, conditioning on $H_{1,2}$ and using the
standard concentration inequality, we obtain

    $\mathbb{P}(b_2 = \mathrm{dist}(X_2, H_{1,2}) \ge t) \le C e^{-ct^2}, \quad t > 0.$

Therefore, the event

(10)    $\mathcal{E} := \{\mathrm{LCD}_{\alpha,c}((H_{1,2})^\perp) \ge e^{cn},\ b_2 < t\}$  satisfies  $\mathbb{P}(\mathcal{E}) \ge 1 - e^{-cn} - C e^{-ct^2}.$

Note that the event $\mathcal{E}$ depends only on $(X_j)_{j=2}^n$. So let us fix a realization
of $(X_j)_{j=2}^n$ for which $\mathcal{E}$ holds. By Corollary 2.3, the vector $Y_2^*$ is now fixed. By
Lemma 2.2, $Y_2^*$ is orthogonal to $(X_j)_{j=3}^n$. Therefore
$Y^* := Y_2^*/\|Y_2^*\|_2 \in (H_{1,2})^\perp$, and because the event $\mathcal{E}$ holds, we have

    $\mathrm{LCD}_{\alpha,c}(Y^*) \ge e^{cn}.$

Let us write in coordinates $a_2 = |\langle Y^*, X_1 \rangle| = \left| \sum_{i=1}^n Y^*(i) X_1(i) \right|$ and recall
that $Y^*(i)$ are fixed coefficients with $\sum_{i=1}^n Y^*(i)^2 = 1$, and $X_1(i)$ are i.i.d.
random variables. We can now apply the Small Ball Probability Theorem 3.3 of
[3] (in dimension $m = 1$) to this random sum. It yields

(11)    $\mathbb{P}_{X_1}(a_2 \le \varepsilon) \le C(\varepsilon + 1/\mathrm{LCD}_{\alpha,c}(Y^*) + e^{-c_1 n}) \le C(\varepsilon + e^{-c_2 n}).$

Here the subscript in $\mathbb{P}_{X_1}$ means that the probability is taken with respect
to the random variable $X_1$ while the other random variables $(X_j)_{j=2}^n$ are
fixed; we will use similar notation later.

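Now we unfix all random vectors, i.e. work with $\mathbb{P} = \mathbb{P}_{X_1, \ldots, X_n}$. We have

$$
\begin{aligned}
\mathbb{P}(a_2 < \varepsilon \text{ or } b_2 > t)
  &= \mathbb{E}_{X_2, \ldots, X_n} \mathbb{P}_{X_1}(a_2 < \varepsilon \text{ or } b_2 > t) \\
  &\le \mathbb{E}_{X_2, \ldots, X_n} \mathbf{1}_{\mathcal{E}}\, \mathbb{P}_{X_1}(a_2 < \varepsilon) + \mathbb{P}_{X_2, \ldots, X_n}(\mathcal{E}^c)
\end{aligned}
$$

because $b_2 < t$ on $\mathcal{E}$. By (11) and (10), we continue as

$$
\begin{aligned}
\mathbb{P}(a_2 < \varepsilon \text{ or } b_2 > t)
  &\le C(\varepsilon + e^{-c_2 n}) + (e^{-cn} + C e^{-ct^2}) \\
  &\le C_1(\varepsilon + e^{-ct^2} + e^{-cn}) =: p(\varepsilon, t, n).
\end{aligned}
$$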

Repeating the above argument for any $k \in \{2, \ldots, n\}$ instead of $k = 2$, we
conclude that

(12)    $\mathbb{P}(a_k/b_k < \varepsilon/t) \le p(\varepsilon, t, n)$  for $\varepsilon > 0$, $t > 0$, $k = 2, \ldots, n.$

From this we can easily deduce the lower bound on the sum of $(a_k/b_k)^2$, which
we need for (9). This can be done using the following elementary observation
proved by applying Markov's inequality twice.

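Proposition 2.4. Let $Z_k \ge 0$, $k = 1, \ldots, n$, be random variables. Then, for
every $\varepsilon > 0$, we have

    $\mathbb{P}\left( \frac{1}{n} \sum_{k=1}^n Z_k < \varepsilon \right) \le \frac{2}{n} \sum_{k=1}^n \mathbb{P}(Z_k < 2\varepsilon).$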
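One standard bound of this double-Markov type reads $\mathbb{P}\big(\frac{1}{n}\sum_{k} Z_k < \varepsilon\big) \le \frac{2}{n}\sum_{k} \mathbb{P}(Z_k < 2\varepsilon)$: if the average is below $\varepsilon$, then fewer than half of the $Z_k$ can be at least $2\varepsilon$, and Markov's inequality applied to the count of small $Z_k$ gives the factor $2/n$. Taking this form as an assumption, the sketch below compares both sides exactly on a finite, uniformly weighted sample space; the table of values is arbitrary test data and the function name is hypothetical.

```python
from fractions import Fraction
import random

def double_markov_sides(table, eps):
    """table[w][k] = Z_k at outcome w, with uniform probability on outcomes.

    Returns (P(mean_k Z_k < eps), (2/n) * sum_k P(Z_k < 2 eps)) as exact Fractions.
    """
    m, n = len(table), len(table[0])
    # Left side: fraction of outcomes whose row average is below eps.
    lhs = Fraction(sum(1 for row in table if sum(row) < eps * n), m)
    # Right side: (2/n) times the sum over k of the fraction of outcomes with Z_k < 2 eps.
    small = sum(1 for row in table for z in row if z < 2 * eps)
    rhs = Fraction(2 * small, n * m)
    return lhs, rhs

rng = random.Random(0)                       # fixed seed, arbitrary nonnegative data
table = [[Fraction(rng.randint(0, 9), 3) for _ in range(5)] for _ in range(200)]
```

For any threshold, for example `eps = Fraction(3, 2)`, the first returned value should not exceed the second. No independence between the $Z_k$ is used, which is what makes the bound applicable to the dependent ratios $(a_k/b_k)^2$.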

We use Proposition 2.4 for $Z_k = (a_k/b_k)^2$, along with the bounds (12). In
view of (9), we obtain

(13)    $\mathbb{P}(\|A^{-1}x\|_2 < (\varepsilon/t) n^{1/2}) \le 2p(4\varepsilon, t, n).$

Estimates (7) and (13) settle the desired bounds in (5), and therefore we
conclude that

$$
\begin{aligned}
\mathbb{P}(s_n(A) \le (ut/\varepsilon) n^{-1/2})
  &\ge \mathbb{P}(\|x\|_2 \le u,\ \|A^{-1}x\|_2 \ge (\varepsilon/t) n^{1/2}) \\
  &\ge 1 - C e^{-cu^2} - 2p(4\varepsilon, t, n).
\end{aligned}
$$

This estimate is valid for all $\varepsilon, u, t > 0$. Choosing $\varepsilon = 1/K$,
$u = t = \sqrt{\log K}$ completes the proof of Theorem 1.2. □